Using the Student’s t-test with extremely small sample sizes
Abstract
Researchers occasionally have to work with an extremely small sample size, defined herein as N ≤ 5. Some methodologists have cautioned against using the t-test when the sample size is extremely small, whereas others have suggested that using the t-test is feasible in such a case. The present simulation study estimated the Type I error rate and statistical power of the one- and two-sample t-tests for normally distributed populations and for various distortions such as unequal sample sizes, unequal variances, the combination of unequal sample sizes and unequal variances, and a lognormal population distribution. Ns per group were varied between 2 and 5. Results show that the t-test provides Type I error rates close to the 5% nominal value in most of the cases, and that acceptable power (i.e., 80%) is reached only if the effect size is very large. This study also investigated the behavior of the Welch test and a rank-transformation prior to conducting the t-test (t-testR). Compared to the regular t-test, the Welch test tends to reduce statistical power and the t-testR yields false positive rates that deviate from 5%. This study further shows that a paired t-test is feasible with extremely small Ns if the within-pair correlation is high. It is concluded that there are no principal objections to using a t-test with Ns as small as 2. A final cautionary note is made on the credibility of research findings when sample sizes are small.
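The kind of simulation summarized in the abstract can be illustrated with a minimal Monte Carlo sketch. This is not the authors' code: the function names, the number of replications, and the seed are assumptions chosen for illustration. It covers only the simplest case (two normal populations with equal variances, N = 2 per group), because the t distribution with 2 degrees of freedom has a closed-form CDF, which gives the two-sided p-value 1 − |t| / √(t² + 2) without any statistics library.

```python
import math
import random


def sample_mean(values):
    return sum(values) / len(values)


def sample_var(values):
    """Unbiased sample variance (divides by n - 1)."""
    m = sample_mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def pooled_t(x, y):
    """Student (pooled-variance) two-sample t statistic."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled = ((nx - 1) * sample_var(x) + (ny - 1) * sample_var(y)) / df
    se = math.sqrt(pooled * (1 / nx + 1 / ny))
    return (sample_mean(x) - sample_mean(y)) / se


def p_value_df2(t):
    """Two-sided p-value for a t statistic with exactly 2 degrees of freedom."""
    return 1 - abs(t) / math.sqrt(t * t + 2)


def rejection_rate(effect_size, n_reps=20000, alpha=0.05, seed=1):
    """Share of simulated experiments (N = 2 per group) with p < alpha.

    effect_size is the true mean difference in standard-deviation units;
    0.0 estimates the Type I error rate, any other value estimates power.
    n_reps and seed are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(2)]
        y = [rng.gauss(effect_size, 1.0) for _ in range(2)]
        if p_value_df2(pooled_t(x, y)) < alpha:
            rejections += 1
    return rejections / n_reps


if __name__ == "__main__":
    print("Estimated Type I error rate:", rejection_rate(0.0))
    print("Estimated power at d = 2:", rejection_rate(2.0))
```

With a true effect of zero the rejection rate should land near the nominal 5%, while the rate under a nonzero effect gives a power estimate for that effect size. Extending the sketch to unequal variances, lognormal data, or N between 3 and 5 would require a general t-distribution CDF, for which a statistics library is the more practical choice.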
Similar resources
Economic Statistical Design of Multivariate T^2 Control Chart with Variable Sample Sizes
Today, quality improvement and cost reduction are key factors for achieving business success, growth and position. One of the primary tools for quality improvement and cost reduction in online activities of statistical process control is control charts. As the need for monitoring several correlated quality characteristics is extensively growing, the use of multivariate control charts become...
Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach
Feature selection can significantly be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not have decent performance in these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...
RELATIVE ERRORS IN CENTRAL LIMIT THEOREMS FOR STUDENT’S t STATISTIC, WITH APPLICATIONS
Student’s t statistic is frequently used in practice to test hypotheses about means. Today, in fields such as genomics, tens of thousands of t-tests are implemented simultaneously, one for each component of a long data vector. The distributions from which the t statistics are computed are almost invariably nonnormal and skew, and the sample sizes are relatively small, typically about one thousa...
Or Bootstrap Calibration Be Applied?
In the analysis of microarray data, and in some other contemporary statistical problems, it is not uncommon to apply hypothesis tests in a highly simultaneous way. The number, N say, of tests used can be much larger than the sample sizes, n, to which the tests are applied, yet we wish to calibrate the tests so that the overall level of the simultaneous test is accurate. Often the sampling distr...
Modified signed log-likelihood test for the coefficient of variation of an inverse Gaussian population
In this paper, we consider the problem of two sided hypothesis testing for the parameter of coefficient of variation of an inverse Gaussian population. An approach used here is the modified signed log-likelihood ratio (MSLR) method which is the modification of traditional signed log-likelihood ratio test. Previous works show that this proposed method has third-order accuracy whereas the traditi...